UNDER NORMALITY

W.  M.  Woods 

Department  of  Operations  Research 

Naval  Postgraduate  School 

Monterey,  CA    93943 


Commander  Wen-Hui  Yang 
Taiwan  Navy 


Closed  form  expressions  of  approximate  lower  confidence  limits  for 
P(X>y)  and  P(X>Y)  are  presented  where  X  and  Y  are  independent  and 
have  normal  probability  distributions  with  unknown  means  and 
variances.  Each  of  the  three  expressions  requires  values  of  percentile 
points  from  the  standard  t  distribution  and  values  of  the  standard 
normal  cumulative  distribution  function  to  compute  the  lower 
confidence  limit  using  a  data  set.  The  expressions  are  shown  to  be  quite 
accurate  for  sample  sizes  of  ten  or  larger. 

KEY  WORDS:    Mechanical  reliability;  interval  estimates. 

1.     INTRODUCTION 

Throughout this paper X and Y are independent normally distributed
variables with unknown means $\mu_x$ and $\mu_y$ and unknown variances
$\sigma_x^2$ and $\sigma_y^2$. $\Phi(z)$ is the standard normal CDF, and
$t_{\alpha,n}$ is the $100(1-\alpha)$th percentile point of the standard t
distribution with n degrees of freedom. The symbol $t_n(\rho)$ denotes a
noncentral t variable with n degrees of freedom and noncentrality
parameter $\rho$. The terms $\bar X$ and $S_x^2$ are the sample mean and
sample variance of a random sample of size n on X. The terms $\bar Y$ and
$S_y^2$ are the sample mean and sample variance of a random sample of size
m on Y. The pooled variance is

$$S_p^2 = \left[(n-1)S_x^2 + (m-1)S_y^2\right] / (n+m-2)$$

and $\bar x$, $\bar y$, $s_x$, $s_y$, $s_p$ denote observed values of the
corresponding random variables.

If  the  random  strength,  X,  of  a  device  must  exceed  some  worst  case  stress 
value  y,  the  device  reliability  is  often  modeled  as  P(X>y).  If  the  stress  variable, 
Y,  is  also  random  the  device  reliability  is  frequently  modeled  as  P(X>Y)  which 
is  the  average  of  P(X>y)  with  respect  to  the  distribution  of  Y  when  X  and  Y  are 
independent.  In  this  paper  we  present  closed  expressions  for  approximate 
lower  confidence  limits  for  P(X>y)  and  P(X>Y).  In  the  latter  case  one 
expression is provided assuming $\sigma_x = \sigma_y$ and a different expression is given
assuming $\sigma_x \neq \sigma_y$.
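
The statement that P(X>Y) is the average of P(X>y) over the distribution of Y can be checked numerically. The sketch below is illustrative only: the function names and the parameter values are assumptions, and it uses the closed form $\Phi\left((\mu_x-\mu_y)/(\sigma_x^2+\sigma_y^2)^{1/2}\right)$ that appears in Section 2 for comparison.

```python
from math import sqrt
from statistics import NormalDist


def reliability_closed_form(mu_x, sigma_x, mu_y, sigma_y):
    """P(X > Y) for independent normal X and Y, via the standard normal CDF."""
    return NormalDist().cdf((mu_x - mu_y) / sqrt(sigma_x**2 + sigma_y**2))


def reliability_by_averaging(mu_x, sigma_x, mu_y, sigma_y, steps=4000, width=8.0):
    """Average P(X > y) over the density of Y using the trapezoidal rule."""
    x_dist = NormalDist(mu_x, sigma_x)
    y_dist = NormalDist(mu_y, sigma_y)
    lo = mu_y - width * sigma_y
    hi = mu_y + width * sigma_y
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * (1.0 - x_dist.cdf(y)) * y_dist.pdf(y)
    return total * h


if __name__ == "__main__":
    # Hypothetical strength and stress parameters, chosen only for illustration.
    print(reliability_closed_form(10.0, 1.0, 7.0, 2.0))
    print(reliability_by_averaging(10.0, 1.0, 7.0, 2.0))
```

The two printed values should agree to many decimal places, since the integrand is smooth and decays rapidly.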

Several papers in the literature present the minimum variance unbiased
estimator for P(X>y) when $\mu_x$ and $\sigma_x$ are unknown. See, for example,
Lieberman and Resnikoff (5), Folks and others (4), and Barton (1). Tables of
exact interval estimates for P(X>y) were developed by Owen and Hua (6) for
confidence levels of 90% and 95%. The exact procedure requires a search in
tables of the noncentral t distribution. Owen and Hua have performed this
task and developed more convenient tables. One enters their tables using
confidence level $1-\alpha$, sample size n and sample statistic value
$(\bar x - y)/s = k$ to obtain the lower confidence limit for P(X>y). The lower
limit values obtained using our expression agree with their values to the
nearest hundredth or better for sample sizes of ten or larger and
$1 \le k \le 5$. Values of $t_{\alpha,n}$ and $\Phi(z)$ are required to use our
closed form procedure, which is also a function of $\alpha$, n and k.

If $\sigma_x = \sigma_y$ in the two variable case, our lower confidence limit expression
for P(X>Y) is analogous to the single variate case. Values of $t_{\alpha,n}$ and
$\Phi(z)$ are needed to use this procedure, which is a function of the sample
sizes $n_x$, $n_y$ and $(\bar x - \bar y)/s_p$.

Approximate  confidence  limits  for  P(X>Y)  have  been  developed  by 
Church  and  Harris  (2)  when  Y  has  a  standard  normal  distribution.  Downton 
(3)  modified  their  procedure  slightly  to  obtain  more  accurate  limits  and 
suggests  an  approximate  procedure  when  the  means  and  variances  of  both  X 
and  Y  are  unknown. 

2.     SUMMARY  OF  RESULTS 

For the single variable case we seek a lower $100(1-\alpha)\%$ confidence limit
for P(X>y) = R(y). Let $\delta = (\mu_x - y)/\sigma_x$; then $R(y) = \Phi(\delta)$, and a lower
confidence limit $R(y)_L$ for R(y) is given by

$$R(y)_L = \Phi(\delta_L) \tag{1}$$

where $\delta_L$ is a lower confidence limit for $\delta$.

Owen and Hua (6) used the general confidence interval procedure to find
$\delta_L$. Specifically, $\delta_L$ is the value of $\delta$ for which

$$1-\alpha = P\left[(\bar X - y)/S \le (\bar x - y)/s\right]
= P\left[t_{n-1}\!\left((\mu_x - y)\sqrt{n}/\sigma_x\right) \le (\bar x - y)\sqrt{n}/s\right] \tag{2}$$

Letting $k = (\bar x - y)/s$, Owen and Hua develop tables of $\Phi(\delta_L)$ for many values of
k in [-3, 6], sample sizes n = 2(1)18, 21(3)30, 40(20)100, and $1-\alpha$ = .90 and .95.
The corresponding values of $\delta_L$ can easily be obtained from $\Phi^{-1}(\Phi(\delta_L))$.

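In Section 3 of this paper an approximate lower confidence limit, $\delta_{La}$, for $\delta$
is shown to be

$$\delta_{La} = k - \left(\frac{1}{n} + \frac{k^2}{2(n-1)}\right)^{1/2} t_{\alpha,n-1} \tag{3}$$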

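Values of $\delta_{La}$ were compared with corresponding values of $\delta_L$ from Owen and
Hua's tables for numerous sets of (n, k) to obtain a more accurate expression,
$\delta^*_{La}$. Let

$$N(\alpha,n) = \begin{cases}
60 & \text{if } \alpha = .05,\ 10 \le n \le 60 \\
n & \text{if } \alpha = .05,\ n > 60 \\
n & \text{if } \alpha = .10,\ n \ge 10
\end{cases} \tag{4}$$

$$\delta^*_{La} = k - \left\{\frac{1}{n} + \frac{k^2}{2(n-1)}\right\}^{1/2} t_{\alpha,N(\alpha,n)} \tag{5}$$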

The corresponding approximate $100(1-\alpha)\%$ lower confidence limit $R^*(y)_L$ is

$$R^*(y)_L = \Phi(\delta^*_{La}) \tag{6}$$

Table 1 displays values of $R^*(y)_L$ and those given by Owen and Hua, $R_L$.
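
As an illustration of how the single-variable limit could be evaluated, the following sketch codes the degrees-of-freedom rule N(α, n) of Equation (4) and the modified limit built from it. The function names are hypothetical, and the t percentile is passed in by the caller (for example from a printed table), because the Python standard library has no t quantile function.

```python
from math import isclose, sqrt
from statistics import NormalDist


def effective_df(alpha, n):
    """Degrees of freedom N(alpha, n) as stated in Equation (4)."""
    if isclose(alpha, 0.05) and 10 <= n <= 60:
        return 60
    if (isclose(alpha, 0.05) and n > 60) or (isclose(alpha, 0.10) and n >= 10):
        return n
    raise ValueError("Equation (4) covers only alpha = .05 or .10 with n >= 10")


def approx_lower_limit(k, n, t_value):
    """Approximate lower limit for P(X > y), given k = (xbar - y)/s.

    t_value is the upper-alpha percentile of the t distribution with
    effective_df(alpha, n) degrees of freedom, supplied by the caller.
    """
    delta_star = k - sqrt(1.0 / n + k**2 / (2.0 * (n - 1))) * t_value
    return NormalDist().cdf(delta_star)


if __name__ == "__main__":
    # n = 10, k = 1; tabled t percentiles: t(.10, 10) = 1.372, t(.05, 60) = 1.671
    print(round(approx_lower_limit(1, 10, 1.372), 4))
    print(round(approx_lower_limit(1, 10, 1.671), 4))
```

With these inputs the two results round to .6768 and .6334, the approximate entries reported for n = 10, k = 1 in Table 1.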

For the two variate case, we seek a lower $100(1-\alpha)\%$ confidence limit for
P(X>Y). We first assume $\sigma_x = \sigma_y = \sigma$. Let R = P(X>Y) and
$d = (\mu_x - \mu_y)/\sigma$; then $R = \Phi(d/\sqrt{2})$ and a lower confidence limit, $R_L$, for R is

$$R_L = \Phi(d_L/\sqrt{2}) \tag{7}$$

It is shown in Section 3 that a lower $100(1-\alpha)\%$ confidence limit, $d_L$, for d is
the value of d for which

$$1-\alpha = P\left[t_{n+m-2}\!\left(\frac{(\mu_x-\mu_y)/\sigma}{\sqrt{(1/n)+(1/m)}}\right) \le \frac{(\bar x - \bar y)/s_p}{\sqrt{(1/n)+(1/m)}}\right] \tag{8}$$


TABLE 1. APPROXIMATE (R*_L) AND EXACT (R_L) CONFIDENCE LIMITS FOR P(X>y)

               α = .10               α = .05
  n    k     R*_L      R_L         R*_L      R_L
 10    1    .6768     .6816       .6334     .6305
       2    .8890     .8913       .8535     .8519
       3    .9736     .9745       .9560     .9556
       4    .9958     .9960       .9903     .9903
       5    .9996     .9996       .9985     .9985
 18    1    .7298     .7303       .6460     .6950
       2    .9260     .9257       .9040     .9035
       3    .9877     .9875       .9800     .9800
       4    .9988     .9988       .9973     .9973
       5    .9999     .9999       .9998     .9998
 30    1    .7597     .7594       .7337     .7336
       2    .9431     .9426       .9286     .9285
       3    .9925     .9923       .9885     .9885
       4    .9995     .9994       .9989     .9989
       5    .99998    .99998      .99994    .99994
 60    1    .7865     .7861       .7688     .7691
       2    .9562     .9559       .9478     .9480
       3    .9954     .9953       .9936     .9937
       4    .9998     .9998       .9996     .9996
       5    .99999    .99999      .99999    .99999

Alternatively, an approximate lower confidence limit $d_{La}$ using a Taylor
series expansion of $(\bar x - \bar y)/s_p$ in the manner described in Section 3 is

$$d_{La} = K - \left(\frac{1}{n} + \frac{1}{m} + \frac{K^2}{2(n+m-2)}\right)^{1/2} t_{\alpha,m+n-2} \tag{9}$$

where $K = (\bar x - \bar y)/s_p$. Yang (8) used Monte Carlo simulations to evaluate the
accuracy of $R_{La} = \Phi(d_{La}/\sqrt{2})$ for $\alpha$ = .20, .10, and .05. One thousand
replications of $R_{La}$ were generated for each parameter set $(\sigma, \mu_x, \mu_y, m, n)$.
Values of 1 and 20 were chosen for $\sigma$, then values of $\mu_x$ and $\mu_y$ were selected so that
R = .90, .95 and .99. For each of these six parameter sets, three pairs of sample
sizes were chosen for a total of eighteen parameter sets for each value of $\alpha$.
The results of the simulations are given in Table 2. If $R_{La}$ is an exact
procedure, the values in the column labeled $R_{L,1000(1-\alpha)}$ should equal the
values of R in the same row. The symbol p denotes the proportion of the
1000 lower confidence limits that covered the value of R. The results were
the same for $\sigma = 1$ and $\sigma = 20$.

TABLE 2. ANALYSIS OF APPROXIMATE CONFIDENCE LIMITS FOR
P(X>Y): EQUAL VARIANCES CASE

                        R_{L,1000(1-α)}, p
   R      n     m     α = .20        α = .10        α = .05
 .900     8     8   .8989, .801    .8944, .906    .8884, .960
          8    30   .9003, .798    .9050, .888    .8987, .952
         20    30   .9019, .788    .9034, .888    .9015, .947
 .950     8     8   .9498, .801    .9467, .909    .9406, .962
          8    30   .9511, .791    .9529, .886    .9489, .951
         20    30   .9516, .785    .9517, .889    .9500, .950
 .990     8     8   .9901, .797    .9886, .911    .9865, .962
          8    30   .9906, .783    .9909, .881    .9900, .950
         20    30   .9904, .789    .9904, .895    .9906, .945
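
A coverage study of this kind can be sketched in a few lines. The code below is not Yang's program; it is a minimal illustration, under the equal-variance model, of estimating the proportion p of lower limits that fall at or below the true R. The function names, the seed, and the choice $\mu_y = 0$ are assumptions, and the t percentile is supplied by the caller.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def r_lower_equal_var(x, y, t_value):
    """Approximate lower limit for P(X > Y) from two samples, equal variances."""
    n, m = len(x), len(y)
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    k = (mean(x) - mean(y)) / sqrt(sp2)
    d_la = k - sqrt(1.0 / n + 1.0 / m + k**2 / (2.0 * (n + m - 2))) * t_value
    return NormalDist().cdf(d_la / sqrt(2.0))


def coverage(r_true, n, m, t_value, sigma=1.0, reps=1000, seed=0):
    """Proportion of simulated lower limits that do not exceed the true R."""
    rng = random.Random(seed)
    mu_y = 0.0
    # Choose mu_x so that Phi((mu_x - mu_y) / (sigma * sqrt(2))) equals r_true.
    mu_x = sigma * sqrt(2.0) * NormalDist().inv_cdf(r_true)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, sigma) for _ in range(n)]
        y = [rng.gauss(mu_y, sigma) for _ in range(m)]
        if r_lower_equal_var(x, y, t_value) <= r_true:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # alpha = .05 with n + m - 2 = 48 degrees of freedom: t is about 1.677
    print(coverage(0.90, 20, 30, 1.677))
```

For an accurate procedure the printed proportion should be close to $1-\alpha$ = .95, up to Monte Carlo error of roughly .007 for 1000 replications.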

If we assume $\sigma_x \neq \sigma_y$, let R = P(X>Y) and
$r = (\mu_x - \mu_y)/(\sigma_x^2 + \sigma_y^2)^{1/2}$; then
$R = \Phi(r)$ and a lower $100(1-\alpha)\%$ confidence limit $R_L$ for R is

$$R_L = \Phi(r_L) \tag{10}$$

where $r_L$ is a lower confidence limit for r. Let $\hat r$ be defined by

$$\hat r = (\bar x - \bar y)/(s_x^2 + s_y^2)^{1/2} \tag{11}$$

The general method cannot be used to find a lower confidence limit for r,
because the statistic $\hat r$ cannot be modified to an equivalent statistic whose
distribution is known with r as the only unknown parameter. This is the
same difficulty encountered in the well-known Behrens-Fisher problem
associated with finding confidence intervals for $\mu_x - \mu_y$. However, an
approximate lower confidence limit $r_{La}$ for r can be found by using a Taylor
series expansion of $\hat r$ and fitting a t distribution to the distribution of $\hat r$ using
random degrees of freedom, $\nu$, which are computed from the data. This
procedure is developed in Section 3. The expression for $r_{La}$ is

$$r_{La} = \hat r - \left[\frac{s_x^2/n + s_y^2/m}{s_x^2 + s_y^2} + \frac{\hat r^2\left(s_x^4/(n-1) + s_y^4/(m-1)\right)}{2\left(s_x^2 + s_y^2\right)^2}\right]^{1/2} t_{\alpha,\nu} \tag{12}$$

where

$$\nu = \left(s_x^2 + s_y^2\right)^2 / \left(s_x^4/(n-1) + s_y^4/(m-1)\right) \tag{13}$$

The method used in this procedure to resolve the Behrens-Fisher type of
difficulty is similar to that employed by Welch (7). The expression for $\nu$ in
Equation (13) is different from that used by Welch.
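
The unequal-variance limit can be sketched from summary statistics as follows. The function names are hypothetical. Because the degrees of freedom are generally not an integer, the caller is assumed to obtain the t percentile for the computed degrees of freedom from a table or a statistics package and pass it in as `t_value`.

```python
from math import sqrt
from statistics import NormalDist


def fitted_df(s2x, s2y, n, m):
    """Data-based degrees of freedom of Equation (13)."""
    return (s2x + s2y) ** 2 / (s2x**2 / (n - 1) + s2y**2 / (m - 1))


def r_lower_unequal_var(xbar, ybar, s2x, s2y, n, m, t_value):
    """Approximate lower limit for P(X > Y) when the variances may differ.

    s2x and s2y are the sample variances; t_value is the upper-alpha
    t percentile with fitted_df(s2x, s2y, n, m) degrees of freedom.
    """
    total = s2x + s2y
    r_hat = (xbar - ybar) / sqrt(total)
    var_r = ((s2x / n + s2y / m) / total
             + r_hat**2 * (s2x**2 / (n - 1) + s2y**2 / (m - 1)) / (2.0 * total**2))
    r_la = r_hat - sqrt(var_r) * t_value
    return NormalDist().cdf(r_la)


if __name__ == "__main__":
    # Hypothetical summary statistics, for illustration only.
    nu = fitted_df(1.0, 4.0, 10, 20)
    print(nu)  # about 26.2; look up t(alpha, nu) for this value
    print(r_lower_unequal_var(10.0, 7.0, 1.0, 4.0, 10, 20, 1.706))
```

Passing the percentile in, rather than computing it, keeps the sketch free of third-party dependencies; with a statistics package available, a t quantile function could replace the table lookup.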


Yang (8) has used Monte Carlo simulations to evaluate the accuracy of
the procedure given by Equations (10), (11), (12) and (13). The results of his
simulations are given in Table 3. The symbol p denotes the proportion of the
1000 randomly generated lower confidence limits that covered the value of R.


TABLE 3. ANALYSIS OF APPROXIMATE CONFIDENCE LIMITS FOR
P(X>Y): UNEQUAL VARIANCES CASE

                                      R_{L,1000(1-α)}, p
   R     σ_x    σ_y     n     m     α = .20        α = .10        α = .05
 .900    1.0    2.0    10    20   .9008, .796    .9008, .897    .8959, .956
                       25    35   .8996, .803    .8993, .901    .8988, .963
                       75    50   .8997, .801    .8990, .907    .8998, .950
        10.0   40.0    10    20   .9019, .791    .9022, .889    .9009, .947
                       25    35   .9012, .790    .9005, .897    .8980, .954
                       75    50   .9000, .800    .8995, .902    .8992, .952
 .950    1.0    2.0    10    20   .9500, .800    .9497, .901    .9461, .955
                       25    35   .9502, .798    .9502, .899    .9482, .959
                       75    50   .9496, .808    .9493, .903    .9500, .948
        10.0   40.0    10    20   .9513, .784    .9510, .896    .9517, .945
                       25    35   .9514, .782    .9494, .902    .9470, .955
                       75    50   .9487, .810    .9494, .904    .9500, .950
 .990    1.0    2.0    10    20   .9902, .793    .9899, .906    .9888, .955
                       25    35   .9903, .789    .9899, .901    .9891, .961
                       75    50   .9898, .811    .9899, .905    .9899, .952
        10.0   40.0    10    20   .9906, .777    .9901, .898    .9903, .942
                       25    35   .9907, .775    .9904, .893    .9892, .955
                       75    50   .9899, .803    .9898, .906    .9900, .949



3.     ANALYSIS 

For the single variate case we fit a t distribution to the distribution of the
statistic $(\bar x - y)/s_x$. If $g(\bar x, s) = (\bar x - y)/s_x = k$ is expanded in a Taylor series about
$(\mu_x, \sigma_x)$, and the subscript x is dropped, then

$$g(\bar x, s) \approx (\mu - y)/\sigma + (\bar x - \mu)/\sigma - (s - \sigma)(\mu - y)/\sigma^2 .$$


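Then

$$E(k) = E[g(\bar x, s)] = (\mu - y)\left(4 - 3\sqrt{1 - 1/(2n-1)}\right)/\sigma
\approx (\mu - y)/\sigma \quad \text{if } n > 8, \tag{14}$$

and

$$\sigma_k^2 = \mathrm{var}[g(\bar x, s)] = \mathrm{var}(\bar x)/\sigma^2 + \left((\mu - y)/\sigma^2\right)^2 \mathrm{var}(s)
\approx \frac{1}{n} + \frac{\left((\mu - y)/\sigma\right)^2}{2(n-1)} \tag{15}$$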
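
The last step in the variance expression for k appears to rest on the large-sample variance of the sample standard deviation. A short delta-method sketch, assuming the normal-theory result $\mathrm{var}(s^2) = 2\sigma^4/(n-1)$, is:

```latex
\operatorname{var}(\bar x) = \frac{\sigma^2}{n}, \qquad
\operatorname{var}(s) \approx \frac{\operatorname{var}(s^2)}{(2\sigma)^2}
  = \frac{2\sigma^4/(n-1)}{4\sigma^2} = \frac{\sigma^2}{2(n-1)},
\\[6pt]
\frac{\operatorname{var}(\bar x)}{\sigma^2}
  + \left(\frac{\mu - y}{\sigma^2}\right)^{2}\operatorname{var}(s)
  \approx \frac{1}{n} + \frac{\left((\mu - y)/\sigma\right)^{2}}{2(n-1)} .
```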

Let $\hat\sigma_k^2 = (1/n) + k^2/2(n-1)$. It is easily shown that k is a consistent
estimator for $(\mu - y)/\sigma = \delta$ and for large n the distribution of k will be
approximately normal. Consequently an approximate lower confidence limit
for $(\mu - y)/\sigma$ would be $k - \hat\sigma_k z_\alpha$. We choose instead to approximate the
distribution of k with a central t distribution with n - 1 degrees of freedom
and thus obtain the approximate lower $100(1-\alpha)\%$ confidence limit, $\delta_{La}$, for
$\delta$ as

$$\delta_{La} = k - \left(\frac{1}{n} + \frac{k^2}{2(n-1)}\right)^{1/2} t_{\alpha,n-1} \tag{16}$$


As discussed in Section 2, the tables developed by Owen and Hua can be used
to modify the right member of Equation (16) to obtain the more accurate
lower confidence limit $\delta^*_{La}$ given by

$$\delta^*_{La} = k - \left(\frac{1}{n} + \frac{k^2}{2(n-1)}\right)^{1/2} t_{\alpha,N(\alpha,n)} \tag{17}$$

where $N(\alpha,n)$ is defined in Equation (4).

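In the two variable case when $\sigma_x = \sigma_y$, we fit a t distribution to the
distribution of the statistic $(\bar x - \bar y)/s_p$. Expanding $g(\bar x - \bar y, s_p) = (\bar x - \bar y)/s_p = K$ in a
Taylor series about $(\mu_x - \mu_y, \sigma)$ and collecting terms we get

$$K = g(\bar x - \bar y, s_p) \approx \frac{\mu_x - \mu_y}{\sigma} + \frac{\bar x - \bar y - (\mu_x - \mu_y)}{\sigma} - \frac{(s_p^2 - \sigma^2)(\mu_x - \mu_y)}{2\sigma^3}$$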


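Then

$$E(K) \approx \frac{\mu_x - \mu_y}{\sigma} \tag{18}$$

$$\mathrm{var}(K) \approx \left(\frac{1}{n} + \frac{1}{m}\right) + \frac{(\mu_x - \mu_y)^2}{2\sigma^2(n+m-2)} \tag{19}$$

Let

$$\hat\sigma_K = \left\{\frac{1}{n} + \frac{1}{m} + \frac{K^2}{2(n+m-2)}\right\}^{1/2} \tag{20}$$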

Proceeding in a manner similar to the single variate case we obtain the
approximate lower confidence limit, $d_{La}$, for $d = (\mu_x - \mu_y)/\sigma$ given by

$$d_{La} = K - \left(\frac{1}{n} + \frac{1}{m} + \frac{K^2}{2(n+m-2)}\right)^{1/2} t_{\alpha,n+m-2} \tag{21}$$

and

$$R_{La} = \Phi(d_{La}/\sqrt{2}) \tag{22}$$
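
For readers who want to evaluate the equal-variance limit from summary statistics, a minimal sketch follows. The function names and the numerical inputs are hypothetical, and the t percentile with n + m - 2 degrees of freedom is assumed to be read from a table.

```python
from math import sqrt
from statistics import NormalDist


def pooled_variance(s2x, s2y, n, m):
    """Pooled sample variance from the two sample variances."""
    return ((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2)


def equal_variance_lower_limit(xbar, ybar, s2x, s2y, n, m, t_value):
    """Return (d_La, R_La), the approximate lower limits for d and P(X > Y)."""
    sp = sqrt(pooled_variance(s2x, s2y, n, m))
    big_k = (xbar - ybar) / sp
    sigma_k = sqrt(1.0 / n + 1.0 / m + big_k**2 / (2.0 * (n + m - 2)))
    d_la = big_k - sigma_k * t_value
    return d_la, NormalDist().cdf(d_la / sqrt(2.0))


if __name__ == "__main__":
    # Illustrative inputs: n = m = 10, unit sample variances, K = 2,
    # alpha = .05 so that t(.05, 18) = 1.734 from a standard t table.
    d_la, r_la = equal_variance_lower_limit(12.0, 10.0, 1.0, 1.0, 10, 10, 1.734)
    print(round(d_la, 3), round(r_la, 3))
```

With these inputs the point estimate is $\Phi(2/\sqrt{2}) \approx .92$, while the lower limit is noticeably smaller, which shows how much the sampling error of K costs at n = m = 10.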
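When $\sigma_x \neq \sigma_y$ we fit a t distribution to the distribution of the statistic
$\hat r = (\bar x - \bar y)/(s_x^2 + s_y^2)^{1/2}$. Expanding $\hat r$ in a Taylor series about
$(\mu_x - \mu_y, \sigma_x^2 + \sigma_y^2)$ and collecting terms we get

$$\hat r \approx \frac{\mu_x - \mu_y}{(\sigma_x^2 + \sigma_y^2)^{1/2}} + \frac{\bar x - \bar y - (\mu_x - \mu_y)}{(\sigma_x^2 + \sigma_y^2)^{1/2}} - \frac{(s_x^2 + s_y^2 - \sigma_x^2 - \sigma_y^2)(\mu_x - \mu_y)}{2(\sigma_x^2 + \sigma_y^2)^{3/2}}$$

Consequently

$$E(\hat r) \approx r \tag{23}$$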


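$$\mathrm{var}(\hat r) \approx \frac{\mathrm{var}(\bar x - \bar y)}{\sigma_x^2 + \sigma_y^2} + \frac{(\mu_x - \mu_y)^2\, \mathrm{var}(s_x^2 + s_y^2)}{4(\sigma_x^2 + \sigma_y^2)^3}
= \frac{\sigma_x^2/n + \sigma_y^2/m}{\sigma_x^2 + \sigma_y^2} + \frac{(\mu_x - \mu_y)^2\left(\sigma_x^4/(n-1) + \sigma_y^4/(m-1)\right)}{2(\sigma_x^2 + \sigma_y^2)^3} \tag{24}$$

Let

$$\hat\sigma_{\hat r}^2 = \frac{s_x^2/n + s_y^2/m}{s_x^2 + s_y^2} + \frac{(\bar x - \bar y)^2\left(s_x^4/(n-1) + s_y^4/(m-1)\right)}{2(s_x^2 + s_y^2)^3} \tag{25}$$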
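An approximate $100(1-\alpha)\%$ lower confidence limit, $r_{La}$, for
$(\mu_x - \mu_y)/(\sigma_x^2 + \sigma_y^2)^{1/2} = r$ is

$$r_{La} = \hat r - \hat\sigma_{\hat r}\, t_{\alpha,\nu} \tag{26}$$

where $\nu$ is some appropriate degrees of freedom. Then

$$R_{La} = \Phi(r_{La}) \tag{27}$$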

The following method for finding an "appropriate" degrees of freedom is
used only to develop an expression for the degrees of freedom, $\nu$, that makes
Equation (26) sufficiently accurate. If a strategy similar to that used in the two
previous interval methods is used to find an approximate confidence limit
for r, we need to find the approximate degrees of freedom for an equivalent
expression for $\hat r$ which looks something like a noncentral t distribution. Yang
[9] used a different approach to the following procedure, but obtained the
same result. The expression


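$$\hat r = \frac{\bar x - \bar y}{\sqrt{s_x^2 + s_y^2}} = \frac{(\bar x - \bar y)/\sqrt{\sigma_x^2 + \sigma_y^2}}{\sqrt{(s_x^2 + s_y^2)/(\sigma_x^2 + \sigma_y^2)}} \tag{28}$$

suggests, but is not, a noncentral t distribution with noncentrality parameter
r. If we were to use a t distribution to fit the distribution of $\hat r$, the radicand
$(s_x^2 + s_y^2)/(\sigma_x^2 + \sigma_y^2)$ should be of the form $\chi_c^2/c$. Therefore

$$c\,(s_x^2 + s_y^2)/(\sigma_x^2 + \sigma_y^2) = \chi_c^2 \tag{29}$$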

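Using the properties $\mathrm{var}(\chi_c^2) = 2c$, $\mathrm{var}(s_x^2) = 2\sigma_x^4/(n-1)$, $\mathrm{var}(s_y^2) = 2\sigma_y^4/(m-1)$
and taking the variance of both sides of Equation (29), we have

$$c^2\left(2\sigma_x^4/(n-1) + 2\sigma_y^4/(m-1)\right)/\left(\sigma_x^2 + \sigma_y^2\right)^2 = 2c$$

Solving for c we get

$$c = \left(\sigma_x^2 + \sigma_y^2\right)^2/\left(\sigma_x^4/(n-1) + \sigma_y^4/(m-1)\right)$$

Finally we let

$$\nu = \hat c = \left(s_x^2 + s_y^2\right)^2/\left(s_x^4/(n-1) + s_y^4/(m-1)\right) \tag{30}$$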
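
As a quick sanity check on the fitted degrees of freedom (an illustrative special case, not part of the derivation), suppose the two sample variances happen to be equal, $s_x^2 = s_y^2 = s^2$, and the sample sizes are equal, n = m. The expression then reduces to the pooled degrees of freedom used in the equal-variance case:

```latex
\nu = \frac{(s^2 + s^2)^2}{\dfrac{s^4}{n-1} + \dfrac{s^4}{n-1}}
    = \frac{4 s^4}{2 s^4/(n-1)}
    = 2(n-1) = n + m - 2 .
```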
4. CONCLUSIONS

We  have  derived  closed  form  expressions  for  approximate  lower 
confidence  limits  for  the  reliability  of  a  device  when  reliability  is  modeled  as 
P(X>y)  or  P(X>Y)  where  X  and  Y  are  independent  normally  distributed 
variables  with  unknown  means  and  variances.  The  expressions  have  been 
evaluated  for  accuracy  and  demonstrated  to  be  quite  accurate  when  sample 
sizes  are  larger  than  ten.  The  three  confidence  limit  expressions  presented  are 
easy  to  compute  and  can  be  programmed  on  some  existing  hand-held 
calculators. Percentile values, $t_{\alpha,n}$, of the standard t distribution and values of
the standard normal cumulative distribution function, $\Phi(z)$, are required
for each expression to compute the lower confidence limit values for given
data sets.

In  mechanical  reliability  settings,  X  usually  denotes  strength  and  Y 
denotes stress. However, both reliability models, P(X>Y) and P(X>y), have
numerous  applications  when  X  and  Y  denote  times  to  first  occurrence  of 
events. 